Hierarchical Phrase-Based Translation with Weighted Finite-State Transducers and Shallow-n Grammars
Abstract
In this article we describe HiFST, a lattice-based decoder for hierarchical phrase-based translation and alignment. The decoder is implemented with standard Weighted Finite-State Transducer (WFST) operations as an alternative to the well-known cube pruning procedure. We find that the use of WFSTs rather than k-best lists requires less pruning in translation search, resulting in fewer search errors, better parameter optimization, and improved translation performance. The direct generation of translation lattices in the target language can improve subsequent rescoring procedures, yielding further gains when applying long-span language models and Minimum Bayes Risk decoding. We also provide insights as to how to control the size of the search space defined by hierarchical rules. We show that shallow-n grammars, low-level rule catenation, and other search constraints can help to match the power of the translation system to specific language pairs.
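As a rough illustration of why a lattice can hold more hypotheses than a k-best list at the same cost, the sketch below builds a tiny weighted acceptor from phrase options using only union and concatenation, then finds the cheapest path. It is a toy under stated assumptions, not HiFST and not a real WFST library: the `Lattice` class, the helper names, the example phrases, and the costs are all hypothetical. Costs add along a path and the minimum is taken across paths, as in the tropical semiring.

```python
from dataclasses import dataclass, field

EPS = ""  # epsilon label: an arc that emits no word


@dataclass
class Lattice:
    """Toy acyclic weighted acceptor: start state is 0, one final state."""
    n_states: int
    final: int
    arcs: list = field(default_factory=list)  # (src, dst, word, cost)


def from_phrase(words, cost):
    """Linear lattice for one phrase; the whole cost sits on the first arc."""
    arcs = [(i, i + 1, w, cost if i == 0 else 0.0) for i, w in enumerate(words)]
    return Lattice(len(words) + 1, len(words), arcs)


def _shift(lat, k):
    """Renumber every state of lat by adding k."""
    return [(s + k, d + k, w, c) for s, d, w, c in lat.arcs]


def concat(a, b):
    """Every path of a followed by every path of b."""
    arcs = a.arcs + _shift(b, a.n_states) + [(a.final, a.n_states, EPS, 0.0)]
    return Lattice(a.n_states + b.n_states, b.final + a.n_states, arcs)


def union(a, b):
    """Paths of a or paths of b, joined by a new start and a new final state."""
    ka, kb = 1, 1 + a.n_states
    final = kb + b.n_states
    arcs = _shift(a, ka) + _shift(b, kb) + [
        (0, ka, EPS, 0.0), (0, kb, EPS, 0.0),
        (a.final + ka, final, EPS, 0.0), (b.final + kb, final, EPS, 0.0),
    ]
    return Lattice(final + 1, final, arcs)


def best_path(lat):
    """Cheapest path. Every arc built above has src < dst, so visiting
    arcs sorted by source state is a valid topological order."""
    best = [(float("inf"), [])] * lat.n_states
    best[0] = (0.0, [])
    for s, d, w, c in sorted(lat.arcs):
        cost, words = best[s]
        if cost + c < best[d][0]:
            best[d] = (cost + c, words + ([w] if w != EPS else []))
    return best[lat.final]


def count_paths(lat):
    """Number of distinct start-to-final paths (hypotheses) in the lattice."""
    n = [0] * lat.n_states
    n[0] = 1
    for s, d, _, _ in sorted(lat.arcs):
        n[d] += n[s]
    return n[lat.final]


# Two source spans, each with two hypothetical translation options.
span1 = union(from_phrase(["the", "house"], 1.0), from_phrase(["the", "home"], 1.5))
span2 = union(from_phrase(["is", "small"], 0.5), from_phrase(["is", "little"], 0.7))
lattice = concat(span1, span2)
```

Here `count_paths(lattice)` grows multiplicatively with each concatenated span while the number of arcs grows only additively, which is the compactness that a k-best list lacks. A real decoder would also need determinization, minimization, pruning, and language model composition, none of which this sketch attempts.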
Related articles
Hierarchical phrase-based translation with weighted finite state transducers
This dissertation focuses on Statistical Machine Translation (SMT), particularly hierarchical phrase-based translation frameworks. We first study and redesign hierarchical models using several filtering techniques. Hierarchical search spaces are based on automatically extracted translation rules. As originally defined, they are too big to handle directly without filtering. In thi...
Intersecting Hierarchical and Phrase-Based Models of Translation: Formal Aspects and Algorithms
We address the problem of constructing hybrid translation systems by intersecting a Hiero-style hierarchical system with a phrase-based system and present formal techniques for doing so. We model the phrase-based component by introducing a variant of weighted finite-state automata, called σ-automata, provide a self-contained description of a general algorithm for intersecting weighted synchrono...
A phrase-level machine translation approach for disfluency detection using weighted finite state transducers
We propose a novel algorithm to detect disfluency in speech by reformulating the problem as phrase-level statistical machine translation using weighted finite state transducers. We approach the task as translation of noisy speech to clean speech. We simplify our translation framework such that it does not require fertility and alignment models. We tested our model on the Switchboard disfluency-...
Improved Reordering for Shallow-n Grammar based Hierarchical Phrase-based Translation
Shallow-n grammars (de Gispert et al., 2010) were introduced to reduce over-generation in the Hiero translation model (Chiang, 2005) resulting in much faster decoding and restricting reordering to a desired level for specific language pairs. However, Shallow-n grammars require parameters which cannot be directly optimized using minimum error-rate tuning by the decoder. This paper introduces som...
Journal:
- Computational Linguistics
Volume 36
Publication year: 2010